KPMG | Data Engineer | 4+ YOE



Round 1 - Technical Interview

Work and Conceptual Related:

🔹Tell me about yourself, the kind of projects you have worked on, and the tech stack used.

🔹What does your day-to-day work look like?

🔹Why are you using the tech stack you are using?

🔹What is the alternative to Medallion Architecture?

🔹What is the kind and size of data you deal with on a daily basis?

🔹If the business is using JSON as the file format and you have to convince them to use Parquet, how are you going to convince them?

PySpark Problem:

🔹Given a DataFrame, how are you going to split the data into two columns ('Even', 'Odd'), where even numbers are populated into the Even column and odd numbers are populated into the Odd column?

Python Problem:

🔹Given an array, find the min and max within the array.
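
One way to answer this is a single pass over the array, assuming the interviewer wants the logic spelled out rather than a call to the built-in min() and max(). The function name find_min_max is hypothetical, chosen only for this sketch.

```python
from typing import Sequence, Tuple


def find_min_max(values: Sequence[float]) -> Tuple[float, float]:
    """Return (minimum, maximum) of a non-empty sequence in one pass."""
    if not values:
        raise ValueError("values must not be empty")
    # Start both trackers at the first element, then compare the rest.
    smallest = largest = values[0]
    for value in values[1:]:
        if value < smallest:
            smallest = value
        elif value > largest:
            largest = value
    return smallest, largest


print(find_min_max([3, 1, 4, 1, 5, 9, 2, 6]))
```

If built-ins are allowed, min(values) and max(values) give the same result in two passes.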

SQL Problem:

🔹Given a table with column 'Country', select the data in the sequence below.

Table: Matches:

Col: Country

India

Australia

Pakistan

O/P:

India vs Australia

India vs Pakistan

Australia vs Pakistan
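
A possible answer is a self-join that pairs each country only with the countries listed after it. The sketch below runs the SQL through Python's sqlite3 module so it can be tried locally. The id column is an assumption added here to capture the listed order, since the question only gives the Country column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# 'id' is a hypothetical column that records the order shown in the question.
conn.execute("CREATE TABLE Matches (id INTEGER PRIMARY KEY, Country TEXT)")
conn.executemany(
    "INSERT INTO Matches (id, Country) VALUES (?, ?)",
    [(1, "India"), (2, "Australia"), (3, "Pakistan")],
)

# a.id < b.id keeps each pair once and drops self-pairs.
query = """
SELECT a.Country || ' vs ' || b.Country AS fixture
FROM Matches AS a
JOIN Matches AS b ON a.id < b.id
ORDER BY a.id, b.id
"""
fixtures = [row[0] for row in conn.execute(query)]
for fixture in fixtures:
    print(fixture)
```

Without an ordering column, a condition such as a.Country < b.Country still yields three unique pairs, but in alphabetical rather than listed order.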

🔹Given the 2 tables below, find the count of records for a Left Outer Join and an Inner Join respectively:

A:

1

1

1

1

B:

1

1

1
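
Since every value is 1, each of the 4 rows in A matches all 3 rows in B, so both joins should return 4 x 3 = 12 rows. A quick way to check this is with Python's sqlite3 module. The column name val is an assumption, as the question does not name the column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# 'val' is a hypothetical column name for the single column in each table.
conn.execute("CREATE TABLE A (val INTEGER)")
conn.execute("CREATE TABLE B (val INTEGER)")
conn.executemany("INSERT INTO A VALUES (?)", [(1,)] * 4)
conn.executemany("INSERT INTO B VALUES (?)", [(1,)] * 3)

inner_count = conn.execute(
    "SELECT COUNT(*) FROM A INNER JOIN B ON A.val = B.val"
).fetchone()[0]
left_count = conn.execute(
    "SELECT COUNT(*) FROM A LEFT OUTER JOIN B ON A.val = B.val"
).fetchone()[0]

print("inner:", inner_count, "left outer:", left_count)
```

The left outer join adds nothing extra here because no row in A is left without a match.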

🔹Note the values of the DENSE_RANK() and RANK() output for the data below:

I/P:

85

85

80

75

75

70
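
The difference is that RANK() leaves gaps after ties while DENSE_RANK() does not. The sketch below checks this with Python's sqlite3 module, assuming the values are ranked in descending order (the question does not state the direction) and a SQLite build with window-function support (3.25 or later). The table and column names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# 'Scores' and 'score' are hypothetical names for the input data.
conn.execute("CREATE TABLE Scores (score INTEGER)")
conn.executemany(
    "INSERT INTO Scores VALUES (?)",
    [(85,), (85,), (80,), (75,), (75,), (70,)],
)

query = """
SELECT score,
       RANK()       OVER (ORDER BY score DESC) AS rnk,
       DENSE_RANK() OVER (ORDER BY score DESC) AS dense_rnk
FROM Scores
ORDER BY score DESC
"""
rows = conn.execute(query).fetchall()
for score, rnk, dense_rnk in rows:
    print(score, rnk, dense_rnk)
```

Under these assumptions RANK() gives 1, 1, 3, 4, 4, 6 and DENSE_RANK() gives 1, 1, 2, 3, 3, 4.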

Round 2 - Technical Interview

Work and Conceptual Related:

🔹Tell me about yourself, the kind of projects you have worked on, and the tech stack used.

🔹Explain the architecture of Spark.

🔹Explain the process of how jobs run in Spark.

🔹Some follow-up questions, such as what the Catalyst Optimizer does.

🔹What is the difference between the Logical Plan and the Physical Plan?

🔹What is the difference between ORC and Parquet?

PySpark Problem:

🔹Read a CSV file and create a DataFrame with properties.

🔹Create a DataFrame with two columns, a default String and a default Integer respectively.

Python Problem:

🔹Given a string, output the count of each character into a dictionary:

I/P: string = 'aaabbbccddeeeee'

O/P: Dict = { "a" : 3, "b" : 3, "c" : 2, "d" : 2, "e" : 5 }
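
A short sketch of one answer, using collections.Counter from the standard library. The function name char_counts is hypothetical.

```python
from collections import Counter
from typing import Dict


def char_counts(text: str) -> Dict[str, int]:
    """Count how many times each character appears in text."""
    # Counter tallies each character; dict() converts it to a plain dictionary.
    return dict(Counter(text))


result = char_counts("aaabbbccddeeeee")
print(result)
```

If imports are not allowed, the same tally can be built with a loop and counts[ch] = counts.get(ch, 0) + 1.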

🔹Write a Python program to count occurrences of an input string in a file, e.g. find the number of occurrences of the word 'The' in the sentence 'The lazy fox jumps over the sleeping rabbit. The lazy rabbit doesn't wake up'.
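
A minimal sketch of one approach, assuming whole-word matching and a file small enough to read in one go. The function name count_word_in_file and the ignore_case flag are hypothetical. The question does not say whether 'the' should count as 'The', so both options are left open.

```python
import re
from pathlib import Path


def count_word_in_file(path: str, word: str, ignore_case: bool = False) -> int:
    """Count whole-word occurrences of word in the text file at path."""
    flags = re.IGNORECASE if ignore_case else 0
    # \b anchors the match to word boundaries, so 'The' does not match 'There'.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", flags)
    text = Path(path).read_text(encoding="utf-8")
    return len(pattern.findall(text))
```

For the sample sentence, a case-sensitive search for 'The' finds 2 occurrences, and a case-insensitive search finds 3 because it also picks up 'the sleeping'.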

SQL Problem:

🔹Given two tables, output the result of Inner, Left, Right, and Full Joins respectively.

Table1:

col1

1

1

Table2:

col1

b

a

1
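
The four results can be checked with Python's sqlite3 module. This is a sketch under a few assumptions: col1 is treated as text in both tables (Table2 mixes letters and a number), and the right and full joins are emulated with LEFT JOIN and UNION ALL, because older SQLite builds lack RIGHT and FULL JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: col1 is stored as TEXT in both tables so '1' compares equal to '1'.
conn.execute("CREATE TABLE Table1 (col1 TEXT)")
conn.execute("CREATE TABLE Table2 (col1 TEXT)")
conn.executemany("INSERT INTO Table1 VALUES (?)", [("1",), ("1",)])
conn.executemany("INSERT INTO Table2 VALUES (?)", [("b",), ("a",), ("1",)])

left_sql = "SELECT t1.col1, t2.col1 FROM Table1 t1 LEFT JOIN Table2 t2 ON t1.col1 = t2.col1"
# A right join is a left join with the table order swapped.
right_sql = "SELECT t1.col1, t2.col1 FROM Table2 t2 LEFT JOIN Table1 t1 ON t1.col1 = t2.col1"

queries = {
    "inner": "SELECT t1.col1, t2.col1 FROM Table1 t1 INNER JOIN Table2 t2 ON t1.col1 = t2.col1",
    "left": left_sql,
    "right": right_sql,
    # Full join = left join plus the Table2 rows that found no match.
    "full": left_sql + " UNION ALL " + right_sql + " WHERE t1.col1 IS NULL",
}
results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
for name, rows in results.items():
    print(name, len(rows), rows)
```

Under these assumptions the inner and left joins return 2 rows each (both 1s in Table1 match the single 1 in Table2), while the right and full joins return 4 rows, adding 'b' and 'a' with NULL on the Table1 side.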

Round 3 - Hiring Manager

Work and Conceptual Related:

🔹Tell me the difference between a Data Lake and Delta Lake.

🔹Why did you quit the job?

🔹Even though you have an offer in hand, why did you apply again?

🔹Are you willing to join if we offer the same as the offer you have in hand?

🔹Are you willing to relocate to Bangalore, even though you are settled in Hyderabad? Why?